Efficient Neural Architecture Search


MOTE-NAS: Multi-Objective Training-based Estimate for Efficient Neural Architecture Search

Neural Information Processing Systems

Neural Architecture Search (NAS) methods seek effective optimization of performance metrics such as model accuracy and generalization while facing challenges in search cost and GPU resources. Recent Neural Tangent Kernel (NTK) NAS methods achieve remarkable search efficiency based on a training-free model estimate; however, they overlook the non-convex nature of DNNs in the search process. In this paper, we develop the Multi-Objective Training-based Estimate (MOTE) for efficient NAS, retaining search effectiveness and achieving a new state of the art in the accuracy-cost trade-off. To improve on NTK, and inspired by the Training Speed Estimation (TSE) method, MOTE models the actual performance of DNNs from macro to micro perspectives by capturing the loss landscape and convergence speed simultaneously. Using two reduction strategies, MOTE is generated from a reduced architecture and a reduced dataset.
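The TSE-style proxy that MOTE builds on can be sketched in a few lines: rather than training each candidate to convergence, sum its minibatch training losses over a short training burst on a reduced dataset and rank architectures by that sum, since faster-converging architectures accumulate less loss. A minimal PyTorch sketch, with all names hypothetical and not taken from the paper:

```python
import torch

def tse_score(model, loader, epochs=3, lr=0.01, device="cpu"):
    """Training Speed Estimate proxy (hypothetical sketch): sum of minibatch
    training losses over a short burst; lower sums mean faster convergence."""
    model = model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    total = 0.0
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
            total += loss.item()
    return total  # rank candidate architectures by this proxy (lower is better)
```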


Efficient Neural Architecture Search: A Broad Version

Ding, Zixiang, Chen, Yaran, Li, Nannan, Zhao, Dongbin, Chen, C. L. Philip

arXiv.org Machine Learning

Efficient Neural Architecture Search (ENAS) achieves novel efficiency for learning high-performance architectures via parameter sharing, but suffers from the slow propagation speed of a search model with deep topology. In this paper, we propose a Broad version of ENAS (BENAS) to solve this issue by learning a broad architecture, whose propagation speed is fast, with the reinforcement learning and parameter sharing used in ENAS, thereby achieving higher search efficiency. In particular, we elaborately design the Broad Convolutional Neural Network (BCNN), the search paradigm of BENAS, which obtains satisfactory performance with a broad topology, i.e., fast forward and backward propagation. The proposed BCNN extracts multi-scale features and enhancement representations and feeds them into a global average pooling layer to yield more reasonable and comprehensive representations, so that the performance of a BCNN with shallow topology can be guaranteed (a sketch of this broad topology follows below). To verify the effectiveness of BENAS, several experiments are performed, and the results show that 1) BENAS costs 0.23 days of search, 2x less expensive than ENAS; 2) the architectures learned by BENAS, built into small BCNNs with 0.5 and 1.1 million parameters, obtain state-of-the-art performance of 3.63% and 3.40% test error on CIFAR-10; and 3) the learned architecture, built into a BCNN, achieves 25.3% top-1 error on ImageNet using just 3.9 million parameters.

1 Introduction. Neural Architecture Search (NAS) [25], which automates the process of model design, has gained traction in recent years. However, early approaches [20, 25, 26] suffer from inefficiency. To address this, several one-shot approaches [1, 6, 11, 14, 19] have been proposed.
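The broad-topology idea can be illustrated with a hypothetical sketch (not the paper's BCNN): a shared stem feeds several shallow parallel branches at different scales, and their globally pooled features are concatenated for classification, keeping both forward and backward paths short.

```python
import torch
import torch.nn as nn

class BroadCNN(nn.Module):
    """Hypothetical broad (shallow, wide) topology: a shared stem, parallel
    branches at different scales, and global average pooling over the
    concatenated multi-scale features."""
    def __init__(self, num_classes=10, width=64):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, width, 3, padding=1),
                                  nn.BatchNorm2d(width), nn.ReLU())
        # Each branch downsamples by a different stride -> multi-scale features.
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv2d(width, width, 3, stride=s, padding=1),
                          nn.BatchNorm2d(width), nn.ReLU())
            for s in (1, 2, 4))
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(width * 3, num_classes)

    def forward(self, x):
        x = self.stem(x)
        feats = [self.gap(b(x)).flatten(1) for b in self.branches]
        return self.fc(torch.cat(feats, dim=1))
```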


ShuffleNASNets: Efficient CNN models through modified Efficient Neural Architecture Search

Laube, Kevin Alexander, Zell, Andreas

arXiv.org Machine Learning

Neural network architectures found by sophisticated search algorithms achieve strikingly good test performance, surpassing most human-crafted network models by significant margins. Although computationally efficient, their designs are often very complex, impairing execution speed. Additionally, finding models outside of the search space is impossible by design. While our search space is still limited, we implement otherwise undiscoverable expert knowledge into the economical search algorithm Efficient Neural Architecture Search (ENAS), guided by the design principles and architecture of ShuffleNet V2. While maintaining a baseline-like 2.85% test error on CIFAR-10, our ShuffleNASNets are significantly less complex, require fewer parameters, and are two times faster than the ENAS baseline in a classification task.
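The ShuffleNet V2 design principles referenced here center on its stride-1 unit: split the channels, transform one half with cheap 1x1 and depthwise convolutions, concatenate, then shuffle channels. A hedged PyTorch sketch of such a unit, illustrating the kind of expert knowledge injected into the search space (not the authors' code):

```python
import torch
import torch.nn as nn

def channel_shuffle(x, groups=2):
    # Interleave channel groups so information mixes across branches.
    n, c, h, w = x.size()
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2).reshape(n, c, h, w))

class ShuffleV2Unit(nn.Module):
    """Stride-1 ShuffleNet V2 unit (sketch): split channels, transform one
    half, concatenate, then shuffle. Requires an even channel count."""
    def __init__(self, channels):
        super().__init__()
        c = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(),
            nn.Conv2d(c, c, 3, padding=1, groups=c, bias=False),  # depthwise
            nn.BatchNorm2d(c),
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU())

    def forward(self, x):
        a, b = x.chunk(2, dim=1)  # channel split
        return channel_shuffle(torch.cat([a, self.branch(b)], dim=1))
```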


carpedm20/ENAS-pytorch

@machinelearnbot

ENAS reduces the computational requirement (GPU-hours) of Neural Architecture Search (NAS) by 1000x via parameter sharing between models that are subgraphs within a large computational graph. More configurations can be found here. Efficient Neural Architecture Search (ENAS) is composed of two sets of learnable parameters: the controller LSTM parameters θ and the shared parameters ω. These two sets of parameters are trained alternately, and only the trained controller is used to derive novel architectures. The controller LSTM decides 1) which activation function to use and 2) which previous node to connect to.
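A hypothetical sketch of such a controller, simplified relative to the repository: a constant input token is fed at every LSTM step, and only the two decision types above are sampled. All names here are illustrative, not the repository's API.

```python
import torch
import torch.nn as nn

class Controller(nn.Module):
    """ENAS-style controller sketch: an LSTM that, for each node, samples
    (1) a previous node to connect to and (2) an activation function."""
    def __init__(self, num_nodes=4, hidden=64,
                 activations=("tanh", "relu", "sigmoid", "identity")):
        super().__init__()
        self.num_nodes, self.acts = num_nodes, activations
        self.lstm = nn.LSTMCell(hidden, hidden)
        self.start = nn.Parameter(torch.randn(1, hidden))  # constant input token
        self.node_heads = nn.ModuleList(nn.Linear(hidden, i + 1)
                                        for i in range(num_nodes))
        self.act_head = nn.Linear(hidden, len(activations))

    def sample(self):
        h = torch.zeros(1, self.lstm.hidden_size)
        c = torch.zeros(1, self.lstm.hidden_size)
        arch, log_prob = [], 0.0
        for i in range(self.num_nodes):
            h, c = self.lstm(self.start, (h, c))
            dist = torch.distributions.Categorical(logits=self.node_heads[i](h))
            prev = dist.sample()
            log_prob = log_prob + dist.log_prob(prev)
            h, c = self.lstm(self.start, (h, c))
            dist = torch.distributions.Categorical(logits=self.act_head(h))
            act = dist.sample()
            log_prob = log_prob + dist.log_prob(act)
            arch.append((int(prev), self.acts[int(act)]))
        return arch, log_prob.sum()  # architecture plus its sampling log-probability
```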


Efficient Neural Architecture Search via Parameter Sharing

Pham, Hieu, Guan, Melody Y., Zoph, Barret, Le, Quoc V., Dean, Jeff

arXiv.org Machine Learning

We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Thanks to parameter sharing between child models, ENAS is fast: it delivers strong empirical performance using far fewer GPU-hours than all existing automatic model design approaches, and is notably 1000x less expensive than standard Neural Architecture Search. On the Penn Treebank dataset, ENAS discovers a novel architecture that achieves a test perplexity of 55.8, establishing a new state of the art among all methods without post-training processing. On the CIFAR-10 dataset, ENAS designs novel architectures that achieve a test error of 2.89%, on par with NASNet (Zoph et al., 2018), whose test error is 2.65%.
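The controller phase of the alternating schedule can be sketched as a single REINFORCE step: sample an architecture, measure its validation reward under the shared weights ω, and update θ against a moving-average baseline. Assuming a controller whose sample() returns the architecture and the summed log-probability of its decisions (as in the sketch above), a minimal sketch with hypothetical names:

```python
import torch

def controller_step(controller, optimizer, reward_fn, baseline, decay=0.95):
    """One REINFORCE update for the controller parameters (sketch).
    reward_fn(arch) should evaluate the sampled architecture using the
    shared weights, e.g. by returning its validation accuracy."""
    arch, log_prob = controller.sample()
    reward = reward_fn(arch)
    baseline = decay * baseline + (1 - decay) * reward  # moving-average baseline
    loss = -(reward - baseline) * log_prob              # maximize expected reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return baseline
```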